Abstract

Recent technological advances now make possible the full integration of sound in 
instructional software. Sounds may gain and focus learner attention, reduce distracting stimuli, 
and make learning more engaging. In addition, they may help learners condense, elaborate upon, 
and organize details, highlighting interconnections among new pieces of information and making 
connections to pre-existing knowledge. Thus, sound may hold great promise for moderating 
acquisition, processing, and retrieval “noise” in instructional software. Unfortunately, interface 
and instructional design guides almost completely ignore sound, and research suggests many 
promising instructional uses remain largely unexplored. This paper a) describes current practice; 
b) explores information-processing and communication theoretical foundations for sound’s 
systematic use; and c) proposes a framework for sound’s use in instructional software. 

The Problem

Many recently published educational software programs do not use sound very 
extensively. In fact, some use no sound at all (see, for example, Inspiration, 1997; Interactive
Physics, 1996). Sound in educational software often is only an attention-getting device, such as
introductory theme music (History of Music, 1995; Mathematica for Students, 1996) or a
system-generated “beep” to alert learners to invalid clicks (Divide and Conquer, 1994; The Geometer’s
Sketchpad, 1996). Reinforcing sound effects, such as “bells,” and corrective sound effects, such 
as “buzzers,” are somewhat less widely used to accompany content-based feedback messages, 
particularly in knowledge-assessment items (for example, BioLab * Fly, 1998; The Rosetta 
Stone, 1996). 

A few instructional applications use sound to consolidate, elaborate upon, and organize 
information. Pierian Software’s products, for example, use musical themes to help learners 
differentiate among activities and screen types within the lesson (CampOS Math, 1996; An
Odyssey of Discovery: Exploring Numbers, 1998; An Odyssey of Discovery: Skills for Writers, 
1997). And a very few titles, like Star Wars Droidworks (1998) and Nile Passage to Egypt 
(1995), use sound to help users build metaphors for the way the software works. That said, even 
these audio-rich products fail to integrate environmental sounds, background music, and 
speaking characters into the educational portions of their applications to assist learning. In fact, it 
seems that when sound is used to help supply content information, it is almost exclusively word- 
for-word narration of screen text (see A.D.A.M.: The Inside Story, 1996; Eyewitness 
Encyclopedia History of the World, 1996). 

In sharp contrast, computer game developers have been aggressively integrating sound 
into their applications for some time. When Creative Labs’ SoundBlaster card was first released 
in 1990, it came equipped with a joystick port and was bundled with several audio-enhanced 
games, suggesting the close link between sound and gaming (Creative Labs, 1998). Advanced 
audio products like the 3D Blaster PCI card, the 3D Audio Developer’s Kit, and Environmental 
Audio extensions primarily are marketed to and adopted by three-dimensional adventure game 
developers like Electronic Arts (SimCity 3000, 1999), Activision (HeavyGear II, 1999; Interstate
’82, 1999), DreamWorks Interactive (Trespasser, 1998), and MicroProse (Ultimate Race Pro,
1998). Games like Myst (1994), Doom (1994), and The 7th Guest (1992) employ large audio 
production teams to design high-quality, environmental sound effects that are used extensively. 
LucasArts’ The Dig (1995) combines environmental sound effects with eerie Wagnerian
background music to make the interface believable and to arouse emotions. Berkeley Systems’ 
You Don’t Know Jack games (1995, 1996, 1997, 1998) rely almost entirely upon snappy, quick- 
witted, and occasionally randomly generated speech. These and other computer games 
incorporate sound comprehensively to enhance a sense of immersion. Why then isn’t sound 
being used as extensively in instructional software to enhance learning? 

Barriers to the Use of Sound in Instructional Software 

Obstacles to the use of sound in instructional software may be divided into three classes: 
previous technological barriers, designers’ preconceptions, and limited theoretical and 
conceptual understandings. Each is discussed below. 

Previous Technological Barriers 

Instructional technologists, those involved in the design and development of learning 
resources, have sought ways to use sound in computer programs for years. In the early 1960s, for 
example, student terminals connected to the mainframe-based IBM 1500 Tutorial System 
employed reel-to-reel tape players (Bennion & Schneider, 1975). Lengthy fast-forwarding and 
rewinding delays caused by the attached tape player’s linearity, however, relegated sound’s use 
to self-contained primary examples or very specific and brief, attention-getting narrative cues 
(Dale, 1969). In the mid-1980s videodisc players that provided “random access” to audio and 
video recordings became fairly widely available in the schools (Technology Milestones, 1997). 
While this meant that desired audio or video segments could be played back after only a small
delay, the analog signal format confined presentation to a separate television monitor rather
than integrating it with the computer interface. Digital overlay boards developed in the late
1980s to translate videodisc signals from analog to digital formats only partially solved the
problem; audio and video segments still often were operated using “player” software that was
separate from the instructional software. By the late 1980s and early ’90s, computerized
instruction written for computer-driven multimedia configurations typically involved a lot of
reading on the computer screen that was supplemented, if the user chose, by clicking to view a 
separate visual or audio presentation (see for example, The Adventures of Jasper Woodbury, 
1988-1992; The Great Solar System Rescue, 1992; Interactive Nova, 1990; Introduction to 
Economics, 1986; The Living Textbook, 1990). These applications often relied heavily upon the 
user’s ability and desire to explore the available media, not upon the software’s own dynamic 
presentation of integrated information types (Gygi, 1990; Mayes, 1992). 

Until 1990, when Creative Labs introduced their relatively inexpensive SoundBlaster 
sound card, IBM-compatible (PC) systems had no audio output capabilities beyond their attached 
peripheral devices and the “beep” of their limited internal speakers. And while Apple computers 
had boasted at least 8-bit mono sound output capability since the Macintosh was introduced in 
1984, no microcomputer — PC or Macintosh — came equipped with standard integrated sound 
recording capabilities until the Mac LC was released in October 1990 (Sanford, 1998). Since 
1990, accelerating microprocessor speeds and plummeting equipment prices have meant that PC 
and Macintosh systems under $1500 now have enough computing power and built-in audio 
technology to handle high-quality digital sound (Dell Computer, 2000; MacConnection, 2000). 
Further, as computing speed and power have improved, advances in digital sound production 
techniques have led to more accessible, higher-quality sound files that require less and less 
storage space (Long, 1997). But while the technology for integrating sound now may be less 
limited, the way we think about using sound in instructional software may not be. 

Designers’ Preconceptions 

Our preconceptions about the way a tool works can limit the way we think about using it 
to solve problems (Wertheimer, 1959). For example, Maier (1930, 1931) asked participants in a 
series of experiments to tie together two strings that were hanging from the ceiling. Participants 
quickly discovered that while they held onto one of the strings, they could not reach the other. 
The solution was to tie an object to one of the strings and then to swing the now-weighted string 
toward the other. Maier handed participants a pair of pliers, hinting that the tool could be used to 
solve the problem. He found that participants who could not envision the pliers as anything other 
than a gripping tool could only think to use the tool to extend their reach, an unsuccessful
approach. 

Similarly, the limitations of an older technology can define the way we think to use a 
new, less limited technology (Divesta & Walls, 1967). Apparently, software designers are not 
immune to this “functional fixedness.” For example, Cates (1998) proposed that screen designs 
based on the impediments faced by early World Wide Web adopters continue to influence 
Webmasters’ concepts of how Web pages should look and function today, despite significant 
advances in the capabilities of Hypertext Markup Language. It seems that once design 
components like text, graphics, and sound have been assigned functions, their roles can become 
“fixed” in the designer’s mind, regardless of advances in the technology. Cooper (1995) 
maintained that years of annoying, internal speaker-generated, corrective feedback “beeps” that 
coldly announce the user’s failure have so stigmatized computer sounds that most developers 
wrongly believe using sound is undesirable and should no longer be considered as part of 
interface design. It is also possible, however, that sound continues to be relegated in the software 
interface to error messages, self-contained examples, and screen-text narration because, as was 
the case with the pliers, few people can see how to use it otherwise. 

Limited Theoretical and Conceptual Understandings 

Unfortunately, software designers seeking theoretical and conceptual direction for using 
sound to help students learn will find that there are very few guidelines available. Few pages in 
recently published interface design books are dedicated to the topic. Microsoft’s authoritative 
556-page book, The Windows Interface Guidelines for Software Design (1995), dedicated just 
over one page to the use of sound in the interface. This appears to match the focus of most other 
currently available interface design books (see Bickford, 1997; Galitz, 1997; Mandel, 1997; 
Preece, Rogers, Sharp, Benyon, Holland, & Carey, 1996). Even new editions of older books,
like the third edition of Shneiderman’s 638-page Designing the User Interface: Strategies for
Effective Human-computer Interaction (1998), often offer no guidance on the use of
sound beyond that in earlier editions, despite the dramatic advances that have been made in
computer sound technology in the eleven years since the first edition (Shneiderman, 1987).

Further, most of the “classics” of instructional interface design were written before sound
was a viable design component. Consequently, the instructional software developer looking for 
guidance finds nothing about sound in Jonassen’s 434-page book, Instructional Designs for 
Microcomputer Courseware (1988), and only one short paragraph in Hannafin and Peck’s 379- 
page book, The Design, Development, and Evaluation of Instructional Software (1988). In When 
Machines Teach: Designing Computer Courseware (1987), Keller’s only advice in 196 pages is 
to use sound effects sparingly “to really get attention or show something in particular” (p. 90). 
Neither Giardina’s 247-page Interactive Multimedia Learning Environments: Human Factors
and Technical Considerations on Design Issues (1991) nor Steinberg’s 213-page
Computer-assisted Instruction: A Synthesis of Theory, Practice, and Technology (1991) discusses sound.
Ambron and Hooper mentioned sound only anecdotally in Interactive Multimedia: Visions of 
Multimedia for Developers, Educators, and Information Providers (1988) and Learning With 
Interactive Multimedia: Developing and Using Multimedia Tools in Education (1990). While 
Alessi and Trollip added a second page on sound to the 476-page second edition of their 
Computer-based Instruction: Methods and Development (1991), the only sound type considered 
in the section entitled “Types of audio presentation” was speech. The three pages on sound in the 
second edition of Fleming and Levie’s (1993) 304-page book, Instructional Message Design: 
Principles From the Behavioral and Cognitive Sciences, were primarily devoted to a discussion
of narration. Of the instructional interface design books reviewed, only Schwier and 
Misanchuk’s (1993) Interactive Multimedia Instruction provided some guidelines for the use of 
sound, dedicating five of 347 pages to a section entitled “Designing Audio Segments.” 

Overall, the authors of instructional design guidelines seem to recommend that sound’s 
major function — other than supplying occasional bells and whistles to accompany error and 
feedback messages — should be either to provide stand-alone audio examples (like a musical 
performance or an historical speech), or to narrate screen text. Given this lack of guidelines, what 
direction might research on sound’s use offer? 

Studies investigating sound’s use in computerized instruction have focused almost 
exclusively on digitized or computer-generated synthetic speech narration of content and 
narration’s effects on achievement, time on task, and attitudes toward the software (see, for
example, Barron & Atkins, 1994; Mann, 1994, 1995, 1997; Shih & Alessi, 1996). Other than 
these narration studies — and one study on music’s role in instructional software (Hardy & Jost, 
1996) — the authors could find no other studies examining sound’s role in multimedia 
instruction. Thus, while some outside the field of education have considered the use of sound in 
the interface (for example, Blattner, Sumikawa, & Greenberg, 1989; Gaver, 1986, 1989, 1993a, 
1993b, 1993c, 1994), to date there appear to be no in-depth studies designed to discover the 
efficacy of using sound in instructional software. 

Reeves (1991, 1995) has advised those researching the impact of instructional software to 
improve their understanding “bit by bit” by first constructing theory-based models that preserve 
the technology’s dimensional complexity, then collecting and analyzing relevant data using 
methods that illuminate instructional decision-making, thus avoiding the “empirical swamp” of 
media-comparison studies. Salomon (1994) argued that to devise technologies that truly make a 
difference, instructional designers must be supplied with development guidelines that are based 
on the unique ways various communication technologies and media presentations affect the 
learner. Luskin (1997) agreed. He argued that understanding the fundamental components of 
instructional media and the psychology behind them may be the only way to discover their 
pedagogical capabilities, particularly as it becomes increasingly difficult to separate multiple 
media from the computer technologies that allow students to interact with them. Further, without
a strong theoretical cognitive foundation, the sounds used in instructional software may not only
fail to enhance learning but may actually detract from it.

The Fundamental Nature of the Instructional Communication System 

According to Andre and Phye (1986), learning emerges from processing interactions 
between information from the environment and the learner’s previous experience and 
knowledge. While there has been some debate among psychologists over the exact nature of 
human cognition (see for example, Bigand, 1993; Murdock & Walker, 1969; Peretz, 1993), most 
information-processing theorists have adopted at least the basic structure of the three-stage 
memory model first proposed by Atkinson and Shiffrin in 1968. 

The Atkinson-Shiffrin Information-processing Model

In the Atkinson-Shiffrin model, environmental stimuli in their primitive form are first 
handled by a sensory information store, or sensory register. Signals held here are readily 
displaced by further signals in the same sensory channel. The sensory register filters and then 
routes the incoming signals to a second, short-term store where information is held temporarily 
until it can be encoded for storage. Encoding is the process of building relationships and 
connections within new material or between new material and existing knowledge structures. 
Once encoded, the information is moved into long-term store. Long-term store is both the place 
where we hold newly encoded information and the place from which we retrieve well- 
established memories. Recovering information from long-term store requires cues that may be 
supplied externally by the situation or internally by one’s existing memories. These cues are used 
to search long-term store in order to recognize and to retrieve matches. Control processes 
“oversee” the cognitive system by regulating the exchange of information between the sensory 
register and long-term store, determining which search-and-retrieval strategies to use to access 
information from the long-term store, and deciding when sufficient information is retrieved. 

Some theorists like Tulving (1972) and Paivio (1971) find it useful to divide Atkinson 
and Shiffrin’s long-term store into two components: episodic memory and semantic memory. 
Episodic memory is made up of those things we’ve experienced. We know these things as 
shadowy entities, or images, that we can “see,” “hear,” “feel,” and “smell,” as when one thinks 
about last night’s dinner. Episodic memory is the autobiographical knowledge one has for things 
that have been personally experienced and how those events relate to the already existing 
contents of the episodic memory store. Tulving hypothesized that the semantic memory system 
contains memories that are abstracted from the memory of our original encounter with them so 
that we know them independently and can easily transfer them to other situations. Semantic 
memory is the organized, propositional knowledge, or schema, one has for the meanings, rules, 
and algorithms used to manipulate and understand the many symbol systems encountered in life. 

According to Klatzky (1978) there is an “inextricable relationship” between processing 
and retrieval; distinctions between the two can be somewhat arbitrary. Processing is often 
defined as the change that takes place within the cognitive system that makes subsequent
retrieval possible, while retrieval is often described as supplying the interpretive framework that 
makes processing possible (Anderson, 1977; Smyth, Collins, Morris, & Levy, 1994). But Mayer 
(1993) and others (for example, Craik & Lockhart, 1972; Craik & Tulving, 1975; Lockhart & 
Craik, 1978) have suggested that there may be a qualitative difference between 
processing/encoding and retrieval/encoding. When one encodes for short-term non-conceptual 
retention, he argued, one typically consolidates, elaborates upon, organizes, and otherwise 
analyzes or makes internal connections between incoming stimuli. When one encodes for long- 
term conceptual retention, however, one typically ties into existing constructs, builds metaphors, 
expands theories, and otherwise synthesizes or makes external connections between the new 
information and one’s existing knowledge. When material is thus synthesized for retrieval, one’s 
knowledge of the topic supports long-term problem-solving transfer. When external synthesis 
does not occur, internal analysis produces only rote memorization. This less deeply encoded 
information is likely to be discarded or, if stored in the long-term store, is unlikely to be retrieved 
easily. Long-term memories that are difficult to retrieve can cause problems for future encoding. 

Information-processing theorists maintain that learning occurs when information that has 
been transferred to and stored in long-term memory can be retrieved when needed. Phye (1997) 
proposed that transforming incoming environmental stimuli into learned images and schema 
involves three main operations: acquisition, processing, and retrieval. It appears, however, that 
limitations in each of these operations may restrict the amount of data one can consign to long- 
term storage. 

Limitations in Information Processing 

In order to acquire or make sense of the constant barrage of sensory information, an 
individual must decide, often unconsciously, which information to attend to and which to ignore. 
To explain this phenomenon, psychologists like Broadbent (1958), Deutsch and Deutsch (1963), 
Schneider and Shiffrin (1977), and Posner (1980) posited that all information reaching the 
sensory register is subjected simultaneously, or in parallel, to a preliminary analysis based on 
prior knowledge. From this pre-perceptual analysis of the entire sensory scene, one chooses a 
smaller subset of stimuli to process successively, or in serial, through the rest of the cognitive
system. The “bottleneck” created between parallel pre-perceptual and serial perceptual stages 
restricts the amount of information entering the cognitive system. Individuals remain essentially 
unaware of information not selected for attention. 

Like many later researchers, Wundt (1896/1897) found that short-term store is also of 
limited capacity: There is a limit to the amount of information, or maximal cognitive load, that 
an individual can process in short-term store at any given time (see also Craik, 1979; Murdock, 
1962; Peterson & Peterson, 1959; Shiffrin, 1993). Although it may be that cognitive load varies 
somewhat, depending upon the nature of the input stimuli, our capacity for processing incoming 
data is certainly limited to some finite quantity. Information that exceeds cognitive processing 
capacity is dropped from short-term store without being processed. Further, unless information 
that enters the store is rehearsed, it decays within approximately five to twenty seconds. Short- 
term store limitations dictate that data not encoded and moved into long-term store must be 
overwritten to make room for new incoming stimuli (as when we forget a new phone number 
after hearing another series of numbers) or consciously rehearsed and then discarded 
immediately after use (as when we repeat a telephone number aloud until we have dialed it). 
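
The capacity and decay limits described above can be illustrated with a toy model. The sketch
below is only an illustration, not a model proposed by any of the authors cited: the class name
ShortTermStore, the default capacity of seven items, and the rule that the oldest item is
overwritten first are all assumptions made for the example. The twenty-second decay window is
simply the upper end of the range given above.

```python
class ShortTermStore:
    """Toy short-term store: limited capacity, decay unless rehearsed.

    Hypothetical illustration only. The capacity default and the
    "overwrite the oldest item" rule are assumptions, not claims from the text.
    """

    def __init__(self, capacity=7, decay_seconds=20.0):
        self.capacity = capacity
        self.decay_seconds = decay_seconds
        self._last_attended = {}  # item -> time it was last attended or rehearsed

    def _forget_decayed(self, now):
        # Drop every item that has gone unrehearsed longer than the decay window.
        expired = [item for item, t in self._last_attended.items()
                   if now - t > self.decay_seconds]
        for item in expired:
            del self._last_attended[item]

    def attend(self, item, now):
        """Admit a new item, overwriting the oldest one if the store is full."""
        self._forget_decayed(now)
        if item not in self._last_attended and len(self._last_attended) >= self.capacity:
            oldest = min(self._last_attended, key=self._last_attended.get)
            del self._last_attended[oldest]
        self._last_attended[item] = now

    def rehearse(self, item, now):
        """Refresh an item that is still held, resetting its decay clock."""
        self._forget_decayed(now)
        if item in self._last_attended:
            self._last_attended[item] = now

    def contents(self, now):
        """Items still held at time `now`, least recently refreshed first."""
        self._forget_decayed(now)
        return sorted(self._last_attended, key=self._last_attended.get)


if __name__ == "__main__":
    store = ShortTermStore(capacity=3)
    for second, item in enumerate(["A", "B", "C", "D"]):
        store.attend(item, now=float(second))
    print(store.contents(now=3.0))   # "A" was overwritten when "D" arrived
    store.rehearse("B", now=10.0)
    print(store.contents(now=25.0))  # only the rehearsed item survives
```

In this sketch, an item is lost in one of two ways: it is displaced by a newer arrival when the
store is full, or it goes unrehearsed for longer than the decay window.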

Memories often seem to fade with the passage of time (Ebbinghaus, 1885/1913). 
Forgetting is a failure to retrieve information from long-term store. There are three general 
hypotheses about the factors that cause forgetting, each of which probably contributes to overall 
retrieval problems. The decay hypothesis asserts that the strength of a memory simply weakens 
over time and therefore is harder to retrieve (Wickelgren, 1976). The interference hypothesis 
claims that competition among memories blocks the retrieval of a target memory (Ausubel, 
Robbins, & Blake, 1957; Postman, 1961). The retrieval-cue hypothesis asserts that at the time of 
retrieval we lose access to the internal “indices” that point to the memory’s location in long-term 
store (Norman, 1969). There is some evidence to suggest that once information has been moved 
to long-term store, it remains there forever (Nelson, 1971, 1978). While this means memories 
may never actually leave long-term store, individuals certainly can lose access to them. 

Important as the Atkinson-Shiffrin model has been for explaining and consolidating
much of the existing data on human cognition, the model is not without its shortcomings. Several 
information-processing theorists contend that one particularly troublesome deficiency is the 
model’s unitary short-term store, which implies that input from each of the senses, or modalities, 
is processed along exactly the same route and in exactly the same way (see, for example, 
Bregman, 1990; Marr, 1982). If this were true, they argue, it would not be possible for people to 
process multiple input and output modalities simultaneously as they do. Studies by Baddeley and 
his colleagues indicate that there may be many different short-term stores — at least one per 
modality — each with its own strengths and weaknesses (see Baddeley, 1986, 1990; Baddeley & 
Hitch, 1974, 1977; Baddeley & Liebermann, 1980; Baddeley & Logie, 1992; Salame & 
Baddeley, 1982). This multi-store working memory concept may explain more accurately how 
each of the modalities, including sound, can have its own “specialty” and can be suited uniquely 
to its specific role in information-processing (Alten, 1999). 

The Role of Sound in Information Processing 

We know from our experience with loud, sudden alarms that sounds can be particularly 
demanding of our attention. Wickens (1984) observed that sounds are especially intrusive upon 
our consciousness because, unlike eyes, ears can never be averted or shut with “earlids.” In fact, 
research by Kohfeld (1971) and Posner, Nissen, and Klein (1976) has confirmed that sounds 
generally are more effective than images for gaining attention. But sounds evidently need not be 
alarming or startling to attract us. Some sounds — like a far-away baby’s cry or a flat tire’s faint 
thumping — so immediately activate existing images and schemas that they can be particularly 
effective in focusing our attention (Bernstein, Clark, & Edelstein, 1969a, 1969b; Bernstein &
Edelstein, 1971). Other sounds — like waves hitting the shore or an inspirational Sousa march — 
can hold our attention by making our environment more tangible or by arousing our emotions 
(Thomas & Johnston, 1984). Thus, sounds not only gain attention, but also can help focus 
attention on appropriate information and keep distractions of competing stimuli at bay, engaging 
an individual’s interest over time. 

Sounds supply us with volumes of complex information that we easily interpret in order
to extrapolate important details about the world around us. Sounds can communicate information 
when visual attention is focused elsewhere, when tasks do not require constant visual 
monitoring, or when the visual channel is overburdened. In these ways, sounds can consolidate 
the information we might otherwise obtain visually to help us determine when to cross busy 
streets, to stop pouring liquids, and the like (McAdams, 1993). Further, sounds can elaborate 
upon visual stimuli by providing information about invisible structures, dynamic change, and 
abstract concepts almost impossible to communicate visually. Perkins (1983) noted, for example, 
that many drivers of manual transmission cars rely upon the sound of their car’s engine, not 
visual cues from a speedometer or tachometer, to decide when to shift gears. And Harmon 
(1988) suggested that without even having seen the performance, we know from thunderous 
applause a show was well received. Winn (1993) proposed that sounds form hierarchical clusters 
just as sights do; the difference is that sounds are organized in time, whereas images are 
organized in space. Take, for example, a situation in which five sound sources — a factory 
operating, a person speaking, a helicopter flying, a truck idling, and a motorcycle running — are 
producing sound simultaneously. Yost (1993) observed that organizational temporal clues within 
this composite of sounds allow most people almost instantly to ascertain that five sound sources 
are present, to determine each source’s identity, and to locate the sources spatially. Thus, sounds 
provide a context within which individuals can think actively about connections between and
among new information. 

Gaver (1993a) asserted that when we hear the sound of a car while walking along a road 
at night, we are not likely to focus on the sound itself at all. Instead, we compare what we are 
hearing to our episodic and semantic memories for the sounds objects make, drawing from and 
linking to existing constructs and schemas in order to support our understanding of what is 
happening — and we step out of the car’s path. If, however, the same automobile sound were 
used in a cartoon to accompany a character’s non-automotive hasty retreat, we might instead 
build upon our understanding of the action by metaphorically depicting one event in terms of our 
existing knowledge of another event. The language we later use to describe these sounds 
provides us with the means to discuss the experience with others and to transfer this knowledge
to new situations where we can develop even deeper understandings. (Consider, for example, 
“The baby wailed like a siren;” “the mindless bureaucrat squawked like a parrot;” and “the 
coward squealed like a pig.”) Thus, sounds can tie into, build upon, and expand existing 
constructs in order to help relate new information to a larger system of conceptual knowledge. 

Information-processing theory addresses human cognition. Communication theory, on the 
other hand, addresses human interaction. As was the case with information-processing theory, 
one model — Shannon and Weaver’s The Mathematical Theory of Communication 
(1949/1969) — appears to have been particularly influential in shaping communication theory. 

The Shannon-Weaver Communication Model 

The Shannon-Weaver model proposes that all communication processes begin when a 
source, desiring to produce some outcome, chooses a message to be communicated. A 
transmitter then encodes the message to produce a signal appropriate for transmission over the 
channel that will be used. After the message has been transmitted, a receiver then decodes the 
message from the signal transmitted and passes it on to the destination. 

In person-to-person communication, where one individual performs both the message- 
creation and encoding functions and another individual performs both the message-decoding and 
receiving functions, it may be useful to refer to only a source and a receiver (see Hankersson, 
Harris, & Johnson, 1998; Newcomb, 1953). Further, while Shannon and Weaver defined a 
channel generally as any physical means by which a signal is transmitted, some theorists prefer 
to distinguish between the artificial technical channels of more mechanistic communication 
(such as telephones, films, and newspapers) and the natural sensory channels typical of human 
communication (such as seeing, hearing, touching, smelling, and tasting) (see Moles, 1958/1966; 
Travers, 1964b). According to the Shannon-Weaver model, however, whether technical or 
natural, all channels have limited capacity. In humans, channel capacity generally refers to the 
physiological and psychological limitations on the number of symbols or stimuli that individuals 
can process (Severin & Tankard, 1979). When more symbols are transmitted than a channel can 
handle, some information is lost. 
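
The chain from source to destination, and the loss that occurs when a channel's capacity is
exceeded, can be sketched in a few lines. This is a minimal illustration under stated
assumptions, not a rendering of Shannon and Weaver's mathematics: the function names are
hypothetical, the "signal" is plain ASCII bytes, and capacity is modeled crudely as a maximum
number of symbols that the channel passes along.

```python
from typing import Callable

Channel = Callable[[bytes], bytes]


def encode(message: str) -> bytes:
    """Transmitter: turn the source's message into a signal (ASCII bytes, by assumption)."""
    return message.encode("ascii")


def decode(signal: bytes) -> str:
    """Receiver: rebuild a message from whatever signal arrived."""
    return signal.decode("ascii")


def clear_channel(signal: bytes) -> bytes:
    """A channel with ample capacity: the signal passes through unchanged."""
    return signal


def limited_channel(capacity: int) -> Channel:
    """Build a channel that carries at most `capacity` symbols and drops the rest."""
    def channel(signal: bytes) -> bytes:
        return signal[:capacity]
    return channel


def communicate(message: str, channel: Channel) -> str:
    """Source -> transmitter -> channel -> receiver -> destination."""
    return decode(channel(encode(message)))


if __name__ == "__main__":
    print(communicate("press play", clear_channel))
    print(communicate("press play", limited_channel(5)))
```

With the clear channel the destination receives the full message; with the five-symbol channel
everything after the fifth symbol is lost, which is the sense in which information is lost when
more symbols are transmitted than a channel can handle.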

The Shannon-Weaver model makes a sharp distinction between the information
contained in a message and the meaning of a message. Messages are ordered sets of perceptual 
elements or cues drawn from a particular pool and assembled in a deliberate way (Potts, 1977). 
The model defines information as the freedom that a source has in choosing and putting together 
those message cues. In this sense, the concept of information applies not to the messages 
themselves, as is the case with meaning, but instead to the degrees of freedom in the situation as 
a whole. In other words, information is what a source could communicate, whereas meaning is 
what a source does communicate. Sometimes called the “surprisal” factor, information is that 
aspect of a message that removes or reduces uncertainty in the situation (Pask, 1975a). 

For example, in a simplified situation where messages are paired elements taken from the 
pool ♣, ♦, ♥, and ♠, the maximum number of potential element combinations, or “messages,” is 
sixteen: 

♣♦   ♦♥   ♥♠   ♠♣ 
♣♥   ♦♠   ♥♣   ♠♦ 
♣♠   ♦♣   ♥♦   ♠♥ 
♣♣   ♦♦   ♥♥   ♠♠ 

Assuming nothing is known of the source’s intent, from the receiver’s perspective there is a 
one-in-sixteen chance the source will assemble a particular message for communication. One might 
say that this probability is a measure of the level of uncertainty in the situation. When the source 
communicates one of the sixteen messages, that message resolves the uncertainty. If the number of 
elements in the pool is increased to five, the number of possible paired messages and, therefore, 
the level of uncertainty in the situation rises to twenty-five. That means the same message, when 
chosen from a five-symbol pool, clears up even more uncertainty. It is said to contain more 
information. Thus, the larger the 
pool of possible message elements from which to choose — that is, the greater the degrees of 
freedom in a situation — the smaller the probability that a particular message will be 
communicated. Stated differently, the more elements in the pool, the more uncertain is the 
situation and the more information-filled is the message ultimately communicated. While 
“uncertainty” suggests future events and “information” past events, the property in question is 
the same. In communication theory, that property is referred to as entropy. 
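
The arithmetic behind this example can be sketched in a few lines of Python. The sketch assumes, as the example does, that a message is an ordered pair of symbols and that every pair is equally likely; the function names `message_count` and `entropy_bits` are illustrative and are not part of Shannon and Weaver's notation. Under those assumptions a pool of n symbols yields n × n messages, and the uncertainty resolved by any one message is the base-2 logarithm of that count.

```python
import math

def message_count(pool_size: int, cues_per_message: int = 2) -> int:
    """Number of distinct ordered messages that can be assembled from the pool."""
    return pool_size ** cues_per_message

def entropy_bits(pool_size: int, cues_per_message: int = 2) -> float:
    """Uncertainty, in bits, when every message is equally likely."""
    return math.log2(message_count(pool_size, cues_per_message))

# Compare the four-symbol pool with a hypothetical five-symbol pool.
for pool_size in (4, 5):
    n = message_count(pool_size)
    print(f"pool of {pool_size}: {n} messages, "
          f"p = 1/{n}, entropy = {entropy_bits(pool_size):.2f} bits")
```

A larger pool lowers the probability of any single message and raises the entropy, which is the sense in which the same message is said to contain more information when it is drawn from a larger pool.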

According to Shannon and Weaver, communication is “perfect” when the information 
contained in a message affects the receiver in exactly the way intended by the source. 
Communication is rarely perfect, however; problems that cause discrepancies between the 
transmission and what is ultimately received can arise at any point. 

Problems in Communication 

Shannon and Weaver maintained that problems in communication occur when things get 
added to the signal that were not originally intended by the source. This spurious information, or 
noise, introduces errors that increase the uncertainty in the situation and make the signal harder 
for the receiver to reconstruct accurately. Whether noise originates in the channel, the receiver, 
or in the message itself, it limits the amount of desired information that can be communicated in 
a given situation within a given time. 

Shannon and Weaver divided the analysis of communication problems into three levels. 
“Level A” deals with how accurately the signal is received. When competing external or internal 
stimuli exist in a communication channel, the resulting noise introduces technical errors that can 
overpower all or part of a signal transmission. This disruption prevents the receiver from being 
able to select the communicated signal for decoding. No matter how accurately a message is 
transmitted, however, if it cannot be decoded by the receiver it is not likely to convey the 
intended message (Staniland, 1966). “Level B,” therefore, concerns how precisely the received 
signal conveys the intended message. Decoding requires the receiver to analyze an incoming 
signal based on his or her existing schemas (Crockett, 1988). When no interpretive framework 
exists and none is supplied by the source, the resulting noise introduces semantic errors that 
prevent the signal from conveying the intended message (Smith, 1995). Even when a message is 
interpreted correctly, it still may not accomplish the source’s goal (Petty, Cacioppo, & Kasmer, 
1988). Thus, “Level C” involves whether the received message ultimately produces the outcome 
desired by the source. To effect an outcome, the elements and structure of the message that 
assign connotative meaning (such as aesthetic appeal, style, execution, and other psychological 
and emotional factors) must mesh with the receiver’s own relevant beliefs, cultural values, and 
experiences (Wyer & Gruenfeld, 1995). If this synthesis leads the receiver to make inferences 
about the message that are not intended by its source, the resulting noise introduces conceptual 
errors that can prevent the communication from producing the desired result (Barnlund, 1970). 

According to Shannon and Weaver these three communication levels are interrelated and 
interdependent. It becomes very difficult for a signal to convey its message (Level B) when there 
are errors in the signal’s transmission (Level A). Similarly, a message is unlikely to produce the 
desired outcome (Level C) if misinterpreted (Level B) or received inaccurately (Level A). Thus, 
Shannon and Weaver argued that improving the effectiveness and efficiency of the 
communication process overall requires applying concepts from their model to all three levels of 
communication problems. Pierce (1980) and others have agreed that efforts to find solutions to 
problems at levels B and C might be guided, at least analogically, by the same techniques that 
have proven effective at Level A (Darnell, 1972; Ruben, 1972; Schramm, 1955). 

The Role of Redundancy in Communication 

As a measure of the predictability or certainty in a situation, redundancy is the opposite 
of entropy. Redundancy is the information that cues share: the parts that “overlap.” In fact, the 
word “redundancy” is commonly defined as something that is superfluous or unnecessary 
(Hawkins & Allen, 1991). However, while redundancy between message cues can be omitted 
without losing any information, in communication systems the surplus may not necessarily be 
uncalled-for. Shannon and Weaver found that increasing the redundancy between message cues 
could help to offset noise in a communication system. 

Continuing the four-symbol example from above, consider a transmitted signal 
encountering noise that generates an error-filled message in which only the first cue survives. 

Through the noise, the receiver can discern the identity of the first cue. This information provides 
the receiver with enough certainty to eliminate three-quarters of the possible messages (all those 
that do not begin with that symbol). Nonetheless, because of the error, the receiver remains 
one-quarter uncertain of the identity of the second cue and, hence, of the message. The second 
cue could be any one of the four symbolic possibilities. 

However, if the receiver knows that the source is combining only cues of the same color, 
the possibilities for the second cue are reduced to the two symbols that share the first cue’s 
color. Color redundancy between the message cues halves the receiver’s uncertainty about the 
identity of the second cue. Assume further that the receiver knows that the source is combining 
only cues of the same color and the same shape. Now, receiving only the first cue through the 
noise is sufficient for the receiver also to know the identity of the second cue. Within this 
context, the message, despite its errors, resolves all of the receiver’s uncertainty. In other 
words, total redundancy between the message cues defeats the confounding effects of system noise. 
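
The narrowing described above can be traced step by step. In the following sketch the four suit symbols, their color assignments, and the helper name `candidates` are stand-ins chosen for illustration; it assumes the receiver has heard the first cue clearly and must infer the second, and it treats "same shape" as meaning the identical symbol.

```python
# Hypothetical pool: each symbol has a color; "shape" is the symbol itself.
COLOR = {"♣": "black", "♦": "red", "♥": "red", "♠": "black"}

def candidates(first_cue, same_color=False, same_shape=False):
    """Second cues still possible given what the receiver knows about the source."""
    options = list(COLOR)
    if same_color:
        options = [s for s in options if COLOR[s] == COLOR[first_cue]]
    if same_shape:
        options = [s for s in options if s == first_cue]
    return options

first = "♣"  # the only cue that survived the noise in this illustration
print(candidates(first))                                    # no redundancy
print(candidates(first, same_color=True))                   # color redundancy
print(candidates(first, same_color=True, same_shape=True))  # total redundancy
```

Each added constraint is knowledge the cues share, so each one removes candidates without the receiver needing any more of the damaged signal.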

Although Shannon and Weaver confined their work primarily to the study of the Level A 
problems of mechanistic communication systems, redundancy might be applied to resolve 
problems at all three levels of human communication. For example, a source might attempt to 
correct technical problems in the system (Level A) by retransmitting or amplifying the signal. 
This content redundancy often can help overcome transmission errors by completing obstructed 
signals or by preventing the interference in the first place (Berlyne, 1957a, 1957b, 1957c, 1958; 
Miller, Heise, & Lichten, 1951). A source anticipating semantic problems in the system (Level 
B) might attempt to correct them by supplying the relevant connections between and among 
related message signals. This context redundancy often can help overcome misinterpretations by 
furnishing denotative meanings for signals (Aborn & Rubenstein, 1952; Bateson, 1978; Berger, 
1987; Hewes, 1995; Rubenstein & Aborn, 1954). A source might attempt to correct conceptual 
problems in the system (Level C) by carefully choosing signals that make appropriate links to 
receivers’ preexisting concepts in memory. This construct redundancy clarifies the connotative 
meanings behind message signals and reduces misunderstandings (Heit, 1997; Pask, 1975b). 
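
Of the three forms, content redundancy through retransmission is the easiest to illustrate concretely. The sketch below is a plain repetition scheme with majority voting, a standard textbook device rather than anything specified by Shannon and Weaver or by this paper; the function names, the placeholder message, and the fixed noise pattern are assumptions made so the example is reproducible.

```python
from collections import Counter

def transmit(message, corrupt_positions, noise_symbol="?"):
    """Send one copy of the message; noise overwrites the listed positions."""
    return [noise_symbol if i in corrupt_positions else cue
            for i, cue in enumerate(message)]

def decode(copies):
    """Recover each cue by majority vote across the received copies."""
    return [Counter(position).most_common(1)[0][0] for position in zip(*copies)]

message = ["A", "B", "C", "D"]  # placeholder cues
# Three retransmissions, each hit by noise at a different position.
received = [
    transmit(message, {1}),
    transmit(message, {3}),
    transmit(message, {0}),
]
print(received[0])       # a single copy arrives damaged
print(decode(received))  # the vote across copies restores the message
```

The repair comes at a cost: three cues are sent for every one delivered, so the scheme trades efficiency for reliability.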

Various types of redundancy, therefore, may help to overcome the noise that can raise 
barriers at each level of communication. Redundancy that helps a receiver separate transmitted 
information from system noise increases understanding and is, therefore, desirable. However, 
redundancy not needed by the receiver or that fails to increase understanding can be a burden on 
the system. Leonard (1955) suggested that channel limits mean unnecessary redundancy may 
actually impede the flow of new information and, consequently, decrease communication 
effectiveness. When redundancy exists at the expense of new information, it can introduce its 
own sort of noise (boredom and fatigue induced by repetitiveness) into the system. Thus, 
while highly redundant messages can overcome noise in communications effectively, they are 
not very efficient (Reza, 1994). When a source anticipates noise at the various levels of 
communication, the trick may be in knowing how much and which sort of between-cue message 
redundancy to include in order to counteract noise. Striking the right balance between 
redundancy and entropy appears to be the key to successful communication (Krendl, Ware, Reid, 
& Warren, 1996). 

Copyright © 2001 Mary Jean Bishop and Ward Mitchell Cates 

The Instructional Communication System 

Berlo (1960) suggested that the study of learning processes and the study of 
communication processes differ only in their point of view. While learning models generally 
begin with and focus on how messages are received and processed by learners, communication 
models most often begin with and focus on how messages are sent. The instructional process, 
therefore, might be viewed as a communication system with a set of interrelated parts working 
together to produce learning. 

In an “ideal” instructional communication system, an educator selects an instructional 
message to communicate for student learning. Anticipating communication problems at each of 
the technical, semantic, and conceptual levels, he or she plans a lesson carefully by organizing 
and choosing the appropriate message cues based on the aptitudes and needs of the students. As 
the information to be learned is transmitted over the chosen channel, the student selects the 
lesson material from among the many competing internal and external environmental stimuli. 
The student then interprets the message signals in the way intended and analyzes the information 
selected, making the appropriate internal connections between and among related content points. 
Once the student has committed the information to memory, he or she retrieves the constructs 
necessary for understanding, relates the message to those deeper meanings, and synthesizes the 
new material into his or her existing knowledge. With these external connections established, the 
instructional message is then committed to the student’s long-term memory, the effect desired by 
the educator. 

Unfortunately, one need not visit many classes before discovering that instructional 
communication systems rarely work ideally. While it is tempting to blame the problems within 
these systems on non-curricular pressures like divorce, drugs, teenage pregnancies, racism, and 
in-school violence, considerable theoretical work and research support the argument that 
students’ difficulties are quite often due to failures in overcoming acquisition, processing, and 
retrieval “noise” in the instructional communication system (Glasser, 1969; Goodlad, 1984; Holt, 
1969; Rutter, Maughan, Mortimore, Ouston, & Smith, 1979; Silberman, 1970). 

Noise in Instructional Communication 

Like other forms of communication, instructional communication systems can fail 
because of errors induced by excessive noise. Limitations within any of the three information- 
processing operations (acquisition, processing, and retrieval) can contribute to problems in 
instructional communication. Noise encountered within each operation is discussed below. 

Acquisition Noise 

In order to learn, students must first receive an instructional message accurately (Ormrod, 
1990). When competing external or internal stimuli exist in the channel the resulting acquisition 
noise may disrupt instructional signal selection. These technical errors often cause the learner to 
fail to attend to the communicated instructional material. 

For some students, acquisition noise originates from emotional, psychological, or 
physiological impairments that make it difficult, if not impossible, for them to concentrate 
(Johnston, 1998). But while some students’ acquisition problems do arise from more serious 
disorders, we know from experience that anyone can occasionally have trouble paying attention. 
When someone nudges us, when snow begins to fall outside, when the bell rings, or when the 
smell of pizza fills the air, we notice it and can become distracted (Glass, Holyoak, & Santa, 
1979). Similarly, thoughts that are more compelling than external instructional stimuli can act as 
competing internal stimuli. Such “normal” acquisition noise may be further intensified, Soloway 
(1991) and Healy (1990) argued, by today’s students’ growing expectation for constantly 
changing, multi-sensory environments, an expectation likely based on their extensive experience 
with fast-paced television shows, music videos, and electronic games. Whatever the cause, it has 
been estimated that at any given time in the typical classroom as many as forty percent of the 
learners are not paying attention (White, 1992). And, according to many, without attention there 
is no learning (Anderson, 1995a, 1995b; Gagne & Driscoll, 1988; Grabe, 1986; Sirotnik, 1982). 

Processing Noise 

Unless a learner can decode an instructional signal, the signal is not likely to convey its 
message. That is, learning is more securely established when an instructional signal, the material 
of learning, can be broken down into its constituent parts, relationships between those parts made 
explicit, and the organizational principles and structures of the combined parts recognized 
(Bloom, 1956). However, when learners have no way to interpret a signal, the resulting 
processing noise may distort instructional message analysis. These semantic errors can cause the 
learner to misinterpret the communicated instructional material. 

Processing noise occurs when the learner lacks the cognitive skills to analyze an 
instructional message and no assistance is supplied by the source (Glover, Ronning, & Bruning, 
1990). Students exhibit differences in their competence and predisposition to employ the 
information analysis strategies necessary to overcome processing noise. Harris (1982) claimed 
that most learners exercise rather poor information analysis techniques and Smirnov and 
Zinchenko (1969) found that some learners, particularly younger ones, lack the cognitive 
capacity to engage purposefully in deliberate rehearsal and encoding behavior. While many 
learners are able to engage in voluntary processing activities, they typically still use only 
repetition or rehearsal techniques to forestall short-term memory decay (Brown, 1977; Flavell, 
1979; Flavell & Wellman, 1977). Paris (1978) reported that learners exposed to more effective 
analysis strategies still often do not make the connection between the processing techniques 
(means) and the learning task (end). Few of these students, consequently, appear motivated to 
adopt a more active approach to processing lesson content, even though processing is the critical 
stage at which incoming stimuli are transformed into memorable knowledge (Gagne, 1985). 

Retrieval Noise 

If the elements and structure of an instructional message do not mesh artfully with the 
receiver’s constructs and personal experiences, the message is unlikely to produce the desired 
outcome. That is, learning is accomplished most effectively when an instructional message 
triggers links to existing knowledge (Resnick, 1989). When the message fails to establish links 
supplying the correct framework for understanding and schema building, the resulting retrieval 
noise may discourage instructional message synthesis. These conceptual errors can cause 
learners to misunderstand the broader, connotative meaning of instructional material. 

Retrieval noise occurs whenever a learner has trouble matching the retrieval cues used in 
a message with applicable constructs in long-term memory (Thomson & Tulving, 1970; Tulving 
& Thomson, 1973). There are many ways retrieval noise can occur. A learner simply may not 
possess the knowledge prompted by the message’s retrieval cues. A learner may mistakenly 
retrieve an inappropriate knowledge structure. A learner may have forgotten the target memory. 
Regardless of why retrieval noise occurs, it seems that in instructional communication systems 
retrieval from long-term store can sometimes be slow, effortful, and even unsuccessful. As a 
result, some learners fail to recognize how new information fits with their larger conceptual 
knowledge and even that there is a deeper meaning behind the concepts presented in a lesson. 

Phases of Instructional Communication System Problems 

Like the three levels of communication problems (technical, semantic, and conceptual), 
the nature of an instructional communication problem varies depending upon the level or 
“phase” of learning — selection, analysis, and synthesis. 

During selection the learner receives the instructional signal. “Technical” difficulties at 
this phase arise when the signal cannot overpower competing external and internal stimuli in the 
channel and cause message-transmission problems. Defeating selection-phase instructional 
communication problems requires that the learner direct attention to the signal, isolate and 
disambiguate the signal from the surrounding stimuli, and relate the incoming signal to some 
existing schema in memory. In other words, the learner must be sufficiently interested in the 
message and be physically able to select it for further processing. 

In the analysis phase, the learner decodes the received signal based on his or her existing 
schemas and any clues provided from within the signal itself. “Semantic” difficulties in the 
analysis phase stem from the noise created by missing interpretive frameworks, which cause 
message-interpretation problems. Overcoming analysis-phase problems calls for the learner to 
focus attention on the message, organize and categorize the information contained in the 
message, and use the information contained in the message to build upon existing knowledge. 
Stated differently, in order to decode and encode it for long-term memory storage, the learner 
must be actively curious about the material under study and possess the interpretive framework 
necessary for appropriate analysis. 

During synthesis the learner internalizes and reacts to the decoded message in the way 
intended by the source. “Conceptual” difficulties in this learning phase occur when prompts in 
the message do not match the learner’s existing schema and cause misunderstandings. Defeating 
synthesis-phase problems requires that the learner sustain attention over time, elaborate upon the 
information contained in the instructional message, and use the information contained therein to 
construct transferable knowledge structures. That is, in order to process the material more 
deeply, the learner must be engaged in the instructional message and appropriately synthesize the 
concepts conveyed with his or her existing schemas. 

Acquisition, processing, and retrieval operations are all applied — in varying amounts — 
during each phase of learning. During selection, processing calls upon acquisition heavily; in 
contrast, only the most salient memories are retrieved during selection. During analysis, 
processing is central — although acquisition and retrieval are also relatively active. During 
synthesis, processing calls upon retrieval most heavily, while only the most salient new stimuli 
are acquired. Table 1 depicts the orthogonal relationship between the selection, analysis and 
synthesis phases of learning and the acquisition, processing, and retrieval information-processing 
operations. 

[INSERT TABLE 1 ABOUT HERE.] 

The rows in Table 1 (running horizontally from left to right) represent the learner’s 
selection, analysis, and synthesis phases while the columns (running from top to bottom) 
represent the learner’s acquisition, processing, and retrieval operations. The black labels across 
the top of the table indicate the general nature of the channel noise that might be produced. The 
white cells at the intersections of the learning phases and information-processing operations 
detail the specific nature of the acquisition, processing, or retrieval noise that a message is likely 
to encounter at that juncture. 

When tracing the cells horizontally across the learning phases, one finds that if an 
instructional message fails to gain attention, is not sufficiently salient for the learner to isolate it 
from among the many stimuli encountered, and does not evoke the existing schema(s) from 
memory, the learner is not likely to be interested. Similarly, if an instructional message fails to 
focus attention, neglects to help organize the information it contains, and does not help build 
upon existing knowledge, the learner is not likely to be curious. Lastly, if an instructional 
message fails to hold attention over time, does not help elaborate upon the information it 
contains, and does not support efforts to construct transferable knowledge structures, the learner 
is not likely to be engaged. 

Following the cells vertically down the information-processing operations, it appears that 
the relative “strength” of potential noise increases and the ultimate consequences of that noise 
become more serious at each “deeper” (top to bottom) phase of learning. Thus, instructional- 
message transmission problems at each phase are constituted by the learner’s deepening 
attentional difficulties (column 1). Message-interpretation problems at each phase consist of the 
learner’s intensifying trouble with information manipulation (column 2). And message- 
understanding problems at each phase are compounded by the learner’s advancing problems in 
connecting the new information to existing schemas (column 3). 
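
Because Table 1 is described only in prose at this point, its grid can be restated as a small lookup structure. The cell wording below is paraphrased from the two preceding paragraphs; the variable and function names are illustrative assumptions, not the authors' notation.

```python
OPERATIONS = ("acquisition", "processing", "retrieval")  # columns 1-3

# Rows are learning phases; each row lists what the message must do per
# operation, plus the learner state that is at risk if the message fails.
TABLE_1 = {
    "selection": {
        "cells": ("gain attention", "isolate the signal",
                  "evoke existing schemas"),
        "learner_state": "interested",
    },
    "analysis": {
        "cells": ("focus attention", "organize information",
                  "build upon existing knowledge"),
        "learner_state": "curious",
    },
    "synthesis": {
        "cells": ("hold attention over time", "elaborate upon information",
                  "construct transferable knowledge structures"),
        "learner_state": "engaged",
    },
}

def cell(phase: str, operation: str) -> str:
    """Return the requirement at the intersection of a phase and an operation."""
    return TABLE_1[phase]["cells"][OPERATIONS.index(operation)]

print(cell("analysis", "processing"))
print([TABLE_1[p]["cells"][0] for p in TABLE_1])  # column 1, top to bottom
```

Reading a row gives the conditions for interest, curiosity, or engagement; reading a column shows how one kind of difficulty deepens from selection through synthesis.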

When there is a discrepancy between the outcome a system was designed to achieve and 
what that system actually accomplishes, modifications must be made to improve that system’s 
performance (Kidd & Van Cott, 1972). Adding redundancy to instructional messages may help 
to overcome the acquisition, processing, and retrieval noise that learners encounter at the 
selection, analysis, and synthesis phases of instructional communication. Thus, improving 
instructional communication performance within the system’s channel capacity limits may 
require finding new perceptual message cue combinations that can both deliver sufficient 
amounts of new information and supply the noise-defeating content, context, and construct 
redundancy necessary to enhance learning. 

The Role of Multi-cue Messages in Instructional Communication Systems 

For some time it has been thought that simply adding cues to messages might improve 
the effectiveness of instructional communication. Hoban (1949) and others hypothesized that the 
more cues used, whether within or across sensory channels, the greater the amount of 
information communicated and the more learning gained (see also, Clark, 1932; Einbecker, 
1933; Hansen, 1936; Miller, 1957; Westfall, 1934). While the results of these cue summation 
studies appear contradictory on the surface, Severin (1967a) maintained the differences might be 
explained by the degree of redundancy among cues used in the treatments. Severin noted that 
studies that found no difference between multiple-cue and single-cue communication used cues 
that were almost totally redundant, such as text coupled with word-for-word narration. In these 
studies the wedded cues apparently neither competed with each other nor supplied new 
information (see MacKay, 1973; Severin, 1967b; Travers, 1964a, 1964b; Van Mondfrans & 
Travers, 1964). In contrast, studies that found multiple-cue communications to be less effective 
than single-cue communications used cues with no redundancy between them, such as text 
coupled with unrelated speech. In these studies, the dueling cues probably exceeded channel 
capacity, producing noise that decreased communication efficiency (see Boring, 1950; 
Carpenter, 1953; Cherry & Taylor, 1953; Hernandez-Peon, 1961; Spaulding, 1956). Severin 
concluded that studies that found multi-cue communications to be more effective than single-cue 
communications used cues that were partially redundant, like pictures coupled with related 
narration. In these studies, primary and secondary cues appear offset just enough for the 
secondary cue to supply the right balance of redundancy and new information (see Hartman, 
1961a, 1961b; Ketcham & Heath, 1962; Kramer & Lewis, 1951; Lumsdaine & Gladstone, 1958). 

Severin contended that multi-cue messages can be designed to help improve instructional 
communication. The question is not just whether the message contains multiple cues, but how 
useful the secondary cues are to the receiver and whether the message exceeds channel capacity. 
McAdams and Bigand (1993) argued that sound is uniquely suited to assist in the acquisition, 
processing, and retrieval of new information for those who are not hearing-impaired. If this is 
true, multi-cue messages incorporating sounds may help achieve the optimal redundancy-entropy 
balance and offset noise in instructional communication systems. That is, there may be 
systematic ways to design sounds so that instructional messages supply the content, context, and 
construct redundancy necessary to optimize learning. 

How Sound Might Help to Optimize the Instructional Communication System 

Recall from the earlier discussion that sounds serve three main functions in information 
processing: They gain, focus, and hold our attention over time, helping us to receive messages 
accurately. They consolidate, organize, and elaborate upon the information we gather from our 
environments, helping us interpret messages. And they tie into our previous understandings, 
build upon existing schemas, and supply transferable knowledge structures, helping us to 
understand messages. Given these facts, it seems that using sound might help instructional 
designers to address potential instructional communication problems. 

[INSERT TABLE 2 ABOUT HERE.] 

Table 2 reorganizes Table 1 slightly and fills in suggestions for how sound might be used 
in each cell; the aim is to use sound to “enrich” instructional messages with the redundancy 
necessary to overcome that cell’s noise potential. When one traces the white cells horizontally 
across the learning phases, the framework suggests that a learner’s interest may be captured by 
an instructional message that employs sound to increase novelty, to make the message salient, 
and to appeal to existing schemas. Further, a learner’s curiosity may be aroused by an 
instructional message that uses sound to point out where to exert information-processing effort, 
to differentiate between and systematize content points and main ideas, and to help situate the 
material under study within real-life or metaphorical scenarios. Similarly, a learner’s level of 
engagement may be increased by an instructional message that utilizes sound to make the lesson 
more relevant, to supply elaborative auditory images and mental models, and to help learners 
transfer the material under study by building transferable structures that may be useful in 
subsequent learning. 

Once again, in Table 2 the nature of the auditory redundancy stimuli “intensifies” with 
each “deeper” learning phase. Thus, sound’s content redundancy contributions to the 
instructional message address the learner’s deepening attentional difficulties at each of the three 
learning phases (column 1). Similarly, sound’s context redundancy contributions to the 
instructional message are intended to remediate the learner’s intensifying trouble with 
information manipulation (column 2). Finally, sound’s construct redundancy contributions to the 
instructional message are aimed at ameliorating the learner’s advancing problems in connecting 
the new information to existing schemas (column 3). 

If sound, like any design element, has a larger role to play in instructional software, its 
use should be grounded in helping students acquire, process, and retrieve the material under 
study. Systematically adding auditory cues to instructional messages in this way might enhance 
learning by anticipating learner difficulties and suppressing them before they occur. The 
framework presented above suggests a wide range of interesting research questions and 
establishes the boundaries of a fertile territory for empirical investigation. 

Summary 

While it appears that humans rely heavily upon sound to learn about their environments, 
instructional designers often make little use of auditory information in their computerized 
lessons. The prevailing attitude seems to be that, after all of an instructional software product’s 
visual requirements are satisfied, the designer might then consider adding a few sounds in order 
to gain the learner’s attention from time to time. If instructional multimedia software were a 
train, sound would be its caboose — bringing up the rear, put in place last, and often serving no 
obvious purpose beyond “bells and whistles.” This neglect of the auditory sense appears to be 
less a matter of choice and more a matter of just not knowing how to “sonify” instructional 
designs to enhance learning. More extensive use of sound may someday lead to more effective 
computer-based learning materials; but only if designers understand the cognitive components of 
sound’s use and the ways in which sound can contribute to appropriate levels of redundancy and 
entropy in instructional messages. 

According to system optimization theory, once the theoretical foundation has been laid 
and a framework for sound’s use established, the next step in a systematic inquiry into sound’s 
optimal use involves exploring sound’s effectiveness through an iterative process of software 
development and modification, data collection and analysis, theoretical refinement, and product 
revisions (Kidd & Van Cott, 1972; Savenye & Robinson, 1996; Wilde & Beightler, 1967). It is our 
hope that this paper lays the theoretical foundation and provides a framework for a program of 
research on instructional software’s use of sound. 

Table 1. The orthogonal relationship between learning phases and information-processing operations. 